ABSTRACT

Among opportunities to advance the state of the art
of intelligent computer-assisted instruction (ICAI) are the 
evaluation of ICAI systems and the use of the underlying technology 
in ICAI systems to develop tests. Each issue is addressed via its 
theoretical context, key constructs, appropriate references to the 
literature, methodological aspects, and concrete examples of the 
feasibility of resolving the issue. ICAI systems use artificial 
intelligence and cognitive science to teach a range of subject
matters. Several computer programs are discussed. The key components 
of ICAI systems include a knowledge base, a student model, and 
instructional techniques for teaching declarative or procedural 
knowledge. Research that has contributed to the development of ICAI 
includes research into both formative and summative evaluation, 
measurement of student achievement outcomes, measurement of
individual differences among students, and process measurement and 
analysis. A list of 75 references is presented. (TJH) 
In this chapter we plan to explore two issues in the field of intelligent
computer-assisted instruction (ICAI) that we feel offer opportunities to advance the state
of the art. These issues are evaluation of ICAI systems and the use of the underlying
technology in ICAI systems to develop tests. For each issue we will provide a theoretical
context, discuss key constructs, provide a brief window to the appropriate literature, suggest
methodological solutions, and conclude with a concrete example of the feasibility of the
solution from our own research.

Intelligent Computer-assisted Instruction (ICAI) 

ICAI is the application of artificial intelligence to computer-assisted instruction. 
Artificial intelligence, a branch of computer science, aims to make computers smart in
order to (a) make them more useful and (b) understand intelligence (Winston, 1977). Topic
areas in artificial intelligence have included natural language processing (Schank, 1980),
vision (Winston, 1975), knowledge representation (Woods, 1983), spoken language (Lea,
1980), planning (Hayes-Roth, 1980), and expert systems (Buchanan, 1981). The field of
Artificial Intelligence (AI) has matured in both hardware and software. The most
commonly used language in the field is LISP (List Processing). A major development in
the hardware area is that personal LISP machines are now available at a relatively low cost
(20-50K) with the power of prior mainframes. In the software area two advances stand
out: (a) programming support environments such as LOOPS (Bobrow & Stefik, 1983) and
(b) expert system tools. The application of "expert systems" technology to a host of
real-world problems has demonstrated the utility of artificial intelligence techniques in a
very dramatic style. Expert system technology is the branch of artificial intelligence at this
point most relevant to ICAI.

Expert Systems 

Knowledge-based systems or expert systems are a collection of problem-solving 
computer programs containing both factual and experiential knowledge and data in a
particular domain. When the knowledge embodied in the program is the result of
elicitation from a human expert, these systems are called expert systems. A typical expert system
consists of a knowledge base, a reasoning mechanism popularly called an "inference 
engine" and a "friendly" user interface. The knowledge base consists of facts, concepts, 
and numerical data (declarative knowledge), procedures based on experience or rules of
thumb (heuristics), and causal or conditional relationships (procedural knowledge). The 
inference engine searches or reasons with or about the knowledge base to arrive at 
intermediate conclusions or final results during the course of problem solving. It 
effectively decides when and what knowledge should be applied, applies the knowledge 
and determines when an acceptable solution has been found. The inference engine employs 
several problem-solving strategies in arriving at conclusions. Two of the popular schemes 
involve: starting with a goal description or desired solution and working backwards to the
known facts or current situation (backward chaining); and starting with the current situation
or known facts and working toward a goal or desired solution (forward chaining). The 
user interface may give the user choices (typically menu driven) or allow the user to 
participate in the control of the process (mixed initiative). The interface allows the user to
describe a problem, input knowledge or data, browse through the knowledge base, pose
questions, review the reasoning process of the system, intervene as necessary, and control
overall system operation. Successful expert systems have been developed in fields as 
diverse as mineral exploration (Duda & Gaschnig, 1981) and medical diagnosis (Clancey,
1981). 
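
The two chaining strategies described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the design of any system cited here: the rule format (a set of conditions paired with a conclusion), the function names, and the toy rules are all hypothetical.

```python
# Hypothetical toy rule base: (set of conditions, conclusion).
# The rules and fact names are invented for illustration only.
RULES = [
    ({"fever", "cough"}, "flu_suspected"),
    ({"flu_suspected", "short_of_breath"}, "refer_to_doctor"),
]


def forward_chain(facts, rules):
    """Start from the known facts and fire rules until nothing new is derived."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conditions <= derived and conclusion not in derived:
                derived.add(conclusion)
                changed = True
    return derived


def backward_chain(goal, facts, rules, _seen=frozenset()):
    """Start from a goal and work backwards to the known facts."""
    if goal in facts:
        return True
    if goal in _seen:  # guard against circular rules
        return False
    seen = _seen | {goal}
    for conditions, conclusion in rules:
        if conclusion == goal and all(
            backward_chain(c, facts, rules, seen) for c in conditions
        ):
            return True
    return False
```

Forward chaining returns everything that follows from the facts, which suits open-ended monitoring; backward chaining answers a single yes/no question about one goal and only examines the rules relevant to it.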

ICAI Systems 

ICAI systems use approaches from artificial intelligence and cognitive science to teach a
range of subject matters. Representative types of subjects include: (a) collections of facts,
e.g., South American geography in SCHOLAR (Carbonell & Collins, 1973); (b) complete
system models, e.g., a ship propulsion system in STEAMER (Stevens & Steinberg, 1981)
and a power supply in SOPHIE (Brown & Burton, 1978); (c) completely described
procedural rules, e.g., strategy learning in WEST (Brown, Burton, & de Kleer, 1982) or
arithmetic in BUGGY (Brown, Burton, & Larkin, 1977); (d) partially described procedural
rules, e.g., computer programming in PROUST (Johnson & Soloway, 1983) and the LISP
Tutor (Anderson et al., 1985), rules in ALGEBRA (McArthur, Stasz, & Hotta, 1987), and
diagnosis of infectious diseases in GUIDON (Clancey, 1979); and (e) an imperfectly
understood complex domain, e.g., causes of rainfall in WHY (Stevens, Collins, & Goldin, 1978).
Excellent reviews by Barr and Feigenbaum (1982) and Wenger (1987) document many of
these ICAI systems. Representative research in ICAI is described by O'Neil, Anderson,
and Freeman (1986).

Although suggestive evidence has been provided by Anderson et al. (1985), few of 
these ICAI projects have been evaluated in any rigorous fashion. In a sense they have all 
been toy systems for research and demonstration. Yet they have nonetheless raised a good
deal of excitement and enthusiasm about their likelihood of being effective instructional
environments. 

With respect to cognitive science, progress has been made in the following areas: 
identification and analysis of misconceptions or "bugs" (Clement, Lochhead, & Soloway,
1980), the use of learning strategies (O'Neil & Spielberger, 1979; Weinstein & Mayer,
1986), expert versus novice distinction (Chi, Glaser, & Rees, 1982), the role of mental
models in learning (Kieras & Bovair, 1983), and the role of self-explanations in problem
solving (Chi, Bassok, Lewis, Reimann, & Glaser, 1987).

The key components of an ICAI system are (a) a knowledge base, that is,
what the student is to learn; (b) a student model, either where the student is now with
respect to subject matter or how student characteristics interact with subject matter; and (c) a
tutor, that is, instructional techniques for teaching the declarative or procedural knowledge.
These components are described in more detail by Fletcher (1985). 
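
As a rough illustration of how these three components relate, the sketch below represents each one as a small Python structure. It is an assumption-laden toy, not the architecture of any system cited in this chapter: the class names, the prerequisite-based knowledge base, the "set of mastered skills" student model, and the `next_skill` tutoring policy are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class KnowledgeBase:
    # What the student is to learn: skill name -> list of prerequisite skills.
    prerequisites: dict


@dataclass
class StudentModel:
    # Where the student is now: the skills currently judged to be mastered.
    mastered: set = field(default_factory=set)


def next_skill(kb, student):
    """Hypothetical tutor policy: pick the first unmastered skill whose
    prerequisites have all been mastered, or None if nothing is left."""
    for skill, prereqs in kb.prerequisites.items():
        if skill not in student.mastered and set(prereqs) <= student.mastered:
            return skill
    return None


# Usage: a three-skill domain, invented for illustration.
kb = KnowledgeBase({
    "counting": [],
    "addition": ["counting"],
    "multiplication": ["addition"],
})
student = StudentModel(mastered={"counting"})
print(next_skill(kb, student))  # the tutor proposes "addition"
```

The point of the separation is that the tutor consults both the knowledge base and the student model on every decision, so either can be revised without rewriting the other.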

Knowledge Base. This is the "expert" part of the system. Ideally, this component 
would represent the relevant knowledge domain. In effect, it must contain the knowledge 
and understanding of a subject matter expert. It must be able to generate problem solutions 

However, few instructional design considerations (e.g., Ellis, Wulfeck, &
Fredericks, 1979; Markle, 1967; Merrill & Tennyson, 1977; O'Neil, 1979; Park, Perez, &
Seidel, 1987; or Reigeluth, 1987) are reflected in ICAI tutors. Instructional design is
concerned with "prescribing optimal methods of instruction to bring about desired changes
in student knowledge and skills" or alternatively is viewed as a "linking science ... a body
of knowledge that prescribes instructional actions to optimize designed instructional
outcomes, such as achievement and affect" (Reigeluth, 1983). More recently, there have
been several systematic attempts to incorporate instructional information into the design of
ICAI systems. Such attempts include the design of a new ICAI tutor (O'Neil, Slawson, &
Baker, 1987) and the design of instructional strategies to improve existing ICAI programs
(Baker et al., 1985). However, neither of these efforts systematically evaluated the
resulting "improved" ICAI programs. Research in progress by McArthur of the Rand
Corporation is addressing this issue in the domain of algebra.

Evaluation

Evaluation is an activity purported to provide an improved basis for decision-making. 
Among its key elements are the identification of goals, the assessment of process, the 
collection of information, analysis, and the interpretation of findings. A critical issue in 
any sort of evaluation is the meaning ascribed to the findings. Meaning derives from the
use of measures that are valid for the intervention, from the adequacy of the inferencing
processes used to interpret results, and from the utility of the findings for the intended
users. These facets of meaning require that the designer/developer as well as funding
sources articulate their goals, processes, and potential decision needs so that the evaluation
team can provide results that have meaning for interested parties.

Summative Evaluation. The most common model for evaluation is summative
(Scriven, 1967), which focuses on overall choices among systems or programs based upon
performance levels, time, and cost. In this mode, evaluation is essentially comparative and
contrasts the innovation against other options. These comparisons may be against explicit
choices or may be implicit in terms of current practice or ways resources might be spent in
the future (opportunity costs). 

Summative evaluation asks the question, "Does the intervention work?" In a military 
or industrial training environment, a common question is, "Has training using X approach 
been effective?" Implicit in that question is comparison, for the intervention must be 
judged in comparison to other alternatives, either current practice, or hypothetically, in 
terms of other ways the resources could be used. A second part of the summative 
evaluation question is "How much does it cost?" Again, comparisons may be implicit or 
explicit. Third, summative evaluation develops information related to another critical
question: "Should we buy it?" Here, the issue is the confidence we have in our data, and
the validity of the inferences we draw from such data. We judge the credibility of our
quality and cost information against the validity and credibility of the quality and cost
data for competing alternatives.

Where summative evaluation is weak is in identifying what to do if a system or 
intervention is not an immediate, unqualified success. Given that this state is
common for most interventions in early stages of development, comparative,
summative-type evaluations are usually mis-timed and may create an unduly negative
environment for productivity. Furthermore, because summative evaluation is typically not
designed to pinpoint weaknesses and to explore potential remedies, it provides almost no
help in the development/improvement cycle which characterizes the systematic creation of
training
interventions. 

Formative Evaluation. Evaluation efforts that are instituted at the outset or in the 
process of an innovation's development typically have a different purpose. Formative
evaluation (Baker, 1974) seeks to provide information that focuses on the improvement of 
the innovation and is designed to assist the developer. 

Formative evaluation also addresses, from a metaevaluation perspective, the 
effectiveness of the development procedures used, in order to predict whether the 
application of similar approaches will likely have effective and efficient results. In that
function, formative evaluation seeks to improve the technology at large, rather than the
specific instances addressed one at a time. Formative evaluation is designed
so that its principal outputs are identification of success and failure of segments,
components, and details of programs, rather than a simple overall estimate of project
success. The approach requires that data be developed to permit the isolation of elements
for improvement and, ideally, the generation of remedial options to assure that subsequent
revisions have a higher probability of success. Formative evaluation is a method that was
developed to assist in the development of instructional (training) programs. While the
evaluation team maintains "third-party" objectivity, they typically interact with and
understand program goals, processes, and constraints at a deeper level than evaluation
teams focused exclusively on bottom line assessments of success or failure. Their intent is
to assist their client (either funding agency or project staff) to use systematic data collection
to promote the improvement of the effort.

Basic literature in formative evaluation was developed by Scriven (1967), Baker and 
Alkin (1973), Baker (1974), and Baker and Saloutos (1974). Formative evaluation now
represents the major focus of evaluation efforts in the public education sector (Baker &
Herman, 1985) in the guise of instructional management systems. Multiple models and
procedures are common within formative evaluation. An example of one approach to
formative evaluation for ICAI is depicted in Figure 2. As is shown in Figure 2, formative
evaluation begins with checking whether the design is congruent with specifications and 
ends with revision which includes new data collection on Steps 3-5. An attempt to use this 
approach was conducted by Baker et al. (1985). 

Insert Figure 2 about here 

Tensions in Evaluation. A persistent fact of evaluation is that those evaluated 
rarely see the value of the process. It is something done to them, a necessary evil, a new
chance for failure, often seen as largely irrelevant to their major purpose. This view
generally holds whether it is a person who is evaluated (for selection or credentialling
purposes), such as students and teachers at universities or in the public schools, a program 
evaluated (either as small as a segment or as large as a federal initiative), or a technological 
innovation. Those who get evaluated are almost always reluctant players. 

As persistent a fact, however, is that those in authority have come to believe that 
evaluation is a useful process. Their belief is fostered in part by actual research studies 
showing that evaluation findings, when used, improve the state of affairs. But a more
likely reason that evaluation has been fastened upon as a useful endeavor resides in the
belief that it provides a mechanism for management, or for the appearance of management,
by those in charge of resources. Objectivity, accountability, and efficiency are themes
underlying this commitment to evaluation.

The tension is obvious between those who must participate and those who push the
evaluation process from positions of authority. Evaluation experts have to mediate between
these two sets of views, a challenging, if not always pleasant, task.

The Evaluability of ICAI Applications. Evaluating an emerging technology 
presents serious technical as well as practical problems, and the ICAI field incorporates 
most known or imaginable difficulties. First, much has been claimed by proponents of 
Artificial Intelligence (AI). The claims have led many sponsors to support projects that
they believe intend to produce a fully developed instructional innovation (such as a tutor).
In fact, the intention of the designers may not be to create a working, effective tutor, but to
work toward this goal and thereby to explore the limits of the computer science field. In
this case, the tutor becomes a context for R&D, a constraint under which the designer really
seeks to conduct research, i.e., produce new knowledge about AI processes. Such a 
process makes sense in an emerging field but requires great patience from sponsors. 

Because ICAI efforts develop largely in a research rather than in a development 
context, certain facts characterize them. First, research goals contributing to knowledge 
and theory building appear to be paramount. Focusing on academically respectable efforts 
frequently characterizes emerging, synthetic fields. (See, for instance, the spate of theory
building in educational evaluation in the late 1960s.) Second, efforts are selectively
addressed based on the research predilections (rather than the project development
requirements) of any particular set of investigators. Third, there are no real "off-the-shelf"
components available for easy substitution into the project. Thus, if the researcher
invests effort in knowledge representation, his final product may not "work" because of the
lagged emphasis in another important component, e.g., a tutor. The foreknowledge of
uncertain success need not impair the researcher's enthusiasm for ICAI. Again, the rhetoric of
the goal of a complete ICAI system is useful. In an emerging field, breakthroughs are
anticipated. Secondly, keeping the idea, even as an idea, of a complete future ICAI in the
mind of the researcher suggests fruitful paths of exploration. 

Thus, the lines between research and application in ICAI are murky and undercut neat
categories of R&D processes, such as those identified by Glennan (1968) and Bright
(1970) and used as program elements in DoD work (Basic Research [6.1], Exploratory
Development [6.2], Advanced Development [6.3], and Engineering Development [6.4]).
This reality presents problems for evaluation. Compared to other innovations, the ICAI
"what" to be evaluated is less concrete and identifiable, and more like the probabilistic view
of where a photon is at any point in time. In addition, the field of ICAI uses multiple
metaphors to describe its activity. Figure 3 depicts these multiple metaphors. We believe 
that each setting requires a different role for the student and, thus, a different evaluation 
focus. 

Insert Figure 3 about here 

Secondly, ICAI has evaluability problems partly because of its visibility; the public
persona of AI (see national magazines, films, television, trade books) is high profile. In
startling contrast, the accessibility to AI processes is limited. To the uninitiated, it is
embedded in the recesses of special languages (e.g., LISP, PROLOG) and in arcane jargon
(modified Petri net, overlay models). Coupled with the fact that AI work is conducted in
relatively few centers by a relatively small number of people, understanding an AI
implementation well enough to create sensible options for its assessment is a difficult
proposition. These states are compounded by the strongly capitalistic environment in
which AI research is conducted. The proprietary nature of much work, either that
conducted by large private corporations or by small entrepreneurial enterprises, also works
to obscure the conceptual and procedural features of the work. AI applications are unlike,
therefore, innovations in health, criminal justice, education, industrial training,
employment, or transportation because of the lack of mid-level communication about what
the innovation actually is. Perhaps AI experts can assist in evaluation, but,
understandably, they are more interested in creating something new of their own. All of
this is asserted with full knowledge that at least some of these problems characterize any
rapidly developing new technology.

The utility of evaluation processes also needs to be judged in terms of what 
techniques and options are useful where there is differential confidence in our ability to 
measure and infer, and which procedures have been used credibly in the last ten years. In 
addition, we must consider what requirements ICAI evaluation creates and explore new
methodology to meet these needs. We have begun to develop such a methodology. Table
1 presents questions we believe that an ICAI evaluation should answer and thus increase
the evaluability of ICAI.

Insert Table 1 about here 

Distance Between the Evaluator and the Evaluated. One way to think about
either formative or summative evaluation techniques is in terms of the distance among those
who are conducting the evaluation work, those responsible for the actual day-to-day design
and development of the project, and those who are responsible for providing resources to 
the project. These distances are often represented as the "party" of the evaluation. 

First party evaluation is evaluation conducted by the project staff itself. Common 
examples would be pilot test data collected for input into the design of a final project. It
has the benefit of intimate connection and understanding of the project. Its problem is lack
of distance and detachment. In AI applications, this evaluation work is informal, and
relatively infrequently addressed to the issue of overall effectiveness of the intervention.
Further, many ICAI projects are conceptualized to advance the state of the art in computer
science (a view of the developer). This perspective may conflict with the view of the
funder of a project to create an ICAI system with an instructionally sound tutor.

Second party evaluation involves the assessment of progress or outcomes by the 
supervising funding agency. IPRs and site visits are examples of second party evaluation. 
Arbitrary timing, limited agency attention spans, and limited objectivity are problems here.
Further, a real intellectual give and take is difficult when agency personnel control funds. 

Third party evaluation is evaluation conducted by an independent group. GAO
performs many third party summative evaluations. Independent contractors reporting to
state legislatures, school boards, or school districts also conduct such evaluation. The
benefit of such an approach is the disinterested nature of the investigation, contributing to
the credibility of the findings. However, the validity of external evaluation presents some
difficulty: it requires that the third party get up to speed on technical issues so
that the evaluation methodologies applied are appropriate. The learning required by the
evaluation staff represents an additional "overhead" to the project staff and may be
perceived as a distraction from their primary effort. This sort of evaluation costs more than
the other two.

All of the above types of evaluation can be done using formative or summative techniques.
Third party formative evaluations are rare in general and to our knowledge have only been 
applied once in ICAI (Baker et al., 1985). 

Evaluation Technology. Contrary to popular practice, there is no inherent reason
for totally separating formative and summative evaluation efforts. We have mentioned that
the approaches differ in purpose and client. They also differ in the types of data
appropriate (cost for summative, componential analysis for formative). However, in the
area of performance, they should share some common procedures and criterion measures.
In addition, since ICAI shares some common attributes with CAI, evaluation technology
appropriate to CAI could be used in ICAI (e.g., Merrill et al., 1986; Alessi & Trollip,
1985). The CAI lesson evaluation techniques in Table 2 present some formative (quality
review and pilot testing methods) and some summative techniques (i.e., validation). These
activities were adapted from Alessi and Trollip (1985). Information of this sort is a
necessary but not sufficient set for ICAI evaluation. What is missing in Table 2 and needs
to be developed for ICAI are specific procedures that focus on the unique attributes of
ICAI. Table 3 provides a first cut of such attributes. To our knowledge, there are no
known techniques to evaluate systematically and instructionally the features in Table 3. 
However, an interesting approach for the analysis of rapid prototyping is provided by 
Carroll and Rosson (1984), and Richer (1985) discusses knowledge acquisition 
techniques. 

Insert Table 2 about here. 
Insert Table 3 about here 

It is not likely that evaluation as it is currently practiced can be transferred directly to 
an application field such as ICAI. One approach to exploring the merging of existing 
technologies (ICAI applications with evaluation technology) is to shift points of view in 
order to determine where reasonable matches exist. Looking first from the evaluation 
perspective, let us explore where evaluation has some strengths and could make a 
substantial contribution to ICAI development.

Evaluation's Contribution to ICAI

Research and development in measurement is one of the major productive areas in 
psychology. Sophisticated models for estimating performance have been developed and 
come in and out of vogue. Many of these were created to assist in the selection process, to 
sort those individuals who were better or worse with regard to a particular competency or 
academic domain. However, these approaches, while venerable, have little to contribute to 
the evaluation of programs, either those completed or under continuing development. Most
standardized achievement tests were based on this model, and their use to evaluate
innovation is not recommended for a variety of technical reasons. These reasons can be
summed up in a simple phrase: standardized tests are not sensitive enough to particular
curriculum foci; thus, they are unlikely to detect effects present (the false negative problem)
and will underestimate effects that exist.

• Measurement of Student Achievement Outcomes. However, there are 
newer approaches to the measurement of human performance which do have implications 
for the assessment of ICAI interventions designed to improve learner performance. 
Specifically, the use of domain-referenced achievement testing seems to provide a good 
match with ICAI approaches. In domain-referenced testing (Hively, Patterson, & Page, 
1968; Baker & Herman, 1983; Baker & O'Neil, 1987) one attempts to estimate student 
performance in a well-specified content domain. The approach is essentially top-down, 
with parameters for content selection and criteria for judging adequacy of student output
specified (albeit successively revised) in advance. Test items are conceived as samples
from a universe constrained by the specific parameters. For example, in the area of reading
comprehension, parameters would need to be explicated regarding the genre and content to
be read, the characteristics of the semantics and syntax, including variety, ambiguity,
complexity of sentence patterns, and the presupposed knowledge that the learner would
bring into the instructional/testing setting. In addition, the characteristics of the items
would be identified, in terms of gross format, i.e., short answer, essay, multiple choice,
and in terms of subtler features such as the rules for the construction of wrong answer
alternatives, or for the assessment of free responses. Theoretically, such rules permit the
generation of a universe of test items which can be matrix sampled to provide progress
and end-of-instruction testing.
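
To make the idea of a rule-generated item universe concrete, the sketch below works through a deliberately simple arithmetic domain rather than reading comprehension. Everything in it is an illustrative assumption: the domain specification (sums of two two-digit numbers that require a carry in the ones column), the two wrong-answer rules, and the function names are hypothetical and are not drawn from the studies cited above.

```python
import itertools
import random


def item_universe(low=10, high=99):
    """Enumerate every item allowed by the (hypothetical) domain specification:
    a + b where both are two-digit numbers and the ones column needs a carry."""
    return [
        (a, b)
        for a, b in itertools.product(range(low, high + 1), repeat=2)
        if (a % 10) + (b % 10) >= 10
    ]


def build_item(a, b):
    """Apply the item-construction rules: one key plus rule-based distractors."""
    key = a + b
    return {
        "stem": f"{a} + {b} = ?",
        "key": key,
        # Wrong-answer rules: the carry was dropped; the sum is off by one.
        "distractors": [key - 10, key + 1],
    }


def matrix_sample(universe, n_forms, items_per_form, seed=0):
    """Draw non-overlapping test forms so different students see different items."""
    rng = random.Random(seed)
    drawn = rng.sample(universe, n_forms * items_per_form)
    return [
        drawn[i * items_per_form:(i + 1) * items_per_form]
        for i in range(n_forms)
    ]


universe = item_universe()
forms = matrix_sample(universe, n_forms=3, items_per_form=5, seed=1)
print(len(universe))          # size of the item universe under this specification
print(build_item(27, 38))     # key 65, with distractors 55 and 66
```

Because the forms are drawn from one explicitly specified universe, scores on different forms estimate performance on the same domain, which is what lets small groups of students be assessed repeatedly without reusing items.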

The use of such approaches has the added benefit of utility to small numbers of
students. They do not depend, as does the selection approach described above, upon
normal (and large) distributions of respondents to derive score meaning. On the other
hand, such tests are more demanding to develop, and they depend upon close interaction
with the innovation designer to assure that the specifications are adequate. They contrast
with the common approach of "tacking on" existing measures (like commercially available
standardized tests), an easy enough process but one unlikely to provide information useful
for the fair assessment of improvement of a product. Domain-referenced tests derive their
power from the goodness of their specifications. Their weakness is their idiosyncrasy;
however, the matching of testing procedures to designer's intentions is also their strength.

Because of the attention that ICAI applications devote to representing properly the 
knowledge domain and determining student understanding in process, the application of 
improved assessment techniques, particularly those based on domain-referenced testing,
seems like a good fit. 

• Measurement of Individual Differences. A second area in measurement
that could contribute to the efficient design and assessment of ICAI applications is the 
measurement of individual differences. Psychology has long invested resources in 
determining how best to assess constructs along which individuals show persisting 
differences. For these areas to be useful, such constructs should interact (statistically) with 
instructional options and desired outcomes of the system under study (Corno & Snow,
1986). Common constructs such as ability and intelligence undoubtedly have relevance for
the analysis and implementation of alternative student models and tutoring strategies. Other
constructs related to cognitive style preferences, e.g., the need for structure, the need for
reflection, the attribution of success and failure, could illuminate design options and results
analyses for ICAI applications. Similarly, constructs related to affective states, e.g., state
anxiety (Hedl & O'Neil, 1977), could also provide explanations of findings otherwise 
obscure. 
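
What it means for a construct to "interact (statistically)" with an instructional option can be shown with a small worked example. The sketch below computes an interaction contrast for a two-by-two layout: the effect of one tutoring strategy over another for low-aptitude students, minus the same effect for high-aptitude students. The scores, the aptitude levels, and the strategy labels are made-up numbers chosen only for illustration; they are not data from any study.

```python
from statistics import mean


def interaction_contrast(scores):
    """scores maps (aptitude_level, strategy) -> list of outcome scores.
    Returns the treatment effect for low-aptitude students minus the
    treatment effect for high-aptitude students."""
    m = {cell: mean(values) for cell, values in scores.items()}
    effect_low = m[("low", "structured")] - m[("low", "discovery")]
    effect_high = m[("high", "structured")] - m[("high", "discovery")]
    return effect_low - effect_high


# Hypothetical, invented scores for illustration only.
scores = {
    ("low", "structured"): [70, 74],
    ("low", "discovery"): [60, 64],
    ("high", "structured"): [80, 82],
    ("high", "discovery"): [84, 86],
}
print(interaction_contrast(scores))  # 10 - (-4) = 14
```

A contrast near zero would mean the strategy helps (or hurts) all students about equally, so the construct adds little to tutoring decisions; a large contrast, as in this invented case, is the pattern that would justify branching the tutor on the student model. A real analysis would also need a significance test and adequate sample sizes, which this sketch omits.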

• Process Measurement and Analysis. In formative evaluation, much is 
made of the role of process evaluation, that is, tracking what occurs when, to assure that
inferences about system effectiveness are well placed. Central to this function, however, is
deciding, to the extent possible, what data should be collected and which inferences should
be drawn from the findings. Technology-based innovations often make two seemingly
conflicting classes of errors. One error is collecting everything possible that can be
tracked. Student response times, system operation, errors, student requests, etc., can be
accumulated ad nauseam. The facts seem to be that rarely do developers attend to this glut
of information. They have no strategies for determining how such data should be arranged
in priority, nor ways to draw systematic conclusions from findings. By the time the
database is assembled, developers are often on to new ideas and prospects; old data,
particularly painfully analyzed and interpreted old (to the developer) data, remain only old
and often unused. The other error in technology process measurement is when relevant
information which could be painlessly accumulated and tabulated on-line is ignored.

The challenge for the evaluator is to help decide what data are likely to be most 
relevant. Relevance will presuppose a clear overall goal, such as teaching a target group a
set of skills. In fact, in the entire gamut of measurement options available, the most
significant contribution evaluators may make is clarifying the goals that the designer
possesses but has not articulated. Because of the mixture of research and development 
goals inherent in much ICAI work in education, this is a nontrivial problem. The designers 
may feel they have all the goals they can tolerate. 

• Generation of Instructional Options. Formative evaluators can assist ICAI 
designers to explore different ways in which they can successfully meet their goals. Of 
particular interest, for example, is the extent to which evaluation can highlight alternatives
for the instructional strategies used in the application. In all instructional development, not
the least in ICAI-based approaches, the designer fastens early upon a particular strategy.
Research findings have suggested that teachers and developers are most reluctant to change
the approach they have taken. They will play at the edges rather than rethink their overall
method (Baker, 1976). Yet they could easily adapt their basic approach by
adding particular instructional options to their basic plan, assuming that they make their
choice informed by prior research. A recent study (Baker, Bradley, Ashbacher, & Feifer,
1985) adopted such an approach and experimentally modified WEST to strengthen its
teaching capability. Although largely unsuccessful due to implementation issues, it
demonstrated the feasibility of the concept.

Formative Evaluation of ICAI: A Case Study

This section will focus on the Baker et al. (1985) formative evaluation of PROUST
as an example of a formative evaluation of ICAI. PROUST (Johnson & Soloway, 1983,
1987) was selected by Baker et al. (1985) as one of the projects to formatively evaluate
because its designers communicated serious interest in whether PROUST was
instructionally effective with students.

Evaluation Focus. A three-phase evaluation template was designed for use in the
project evaluation. The first phase of the evaluation included an attempt to understand the
"product" development cycle employed, the ideological orientations of the designers, and
their stated intentions. A second phase of analysis involved reviewing the internal
characteristics of the ICAI systems from two perspectives: first, the quality of the
instructional strategies employed; and second, the quality of the content addressed. A third
and major phase of the study was empirical testing of the programs. Here, the intention
was to document effects of the program with regard to individual difference variables
among learners and with regard to a broadly conceived set of outcome measures including
achievement and attitude instruments. An explicit intent was to modify the instructional
conditions under which the ICAI system operated and make it more effective. Planned
experimental comparisons were one option by which these instructional conditions could be
contrasted. Based on these three major phases (theoretical, instructional, and empirical
analyses), recommendations for the improvement of this particular project and for the ICAI
design and development process in general were to be developed. A wide range of
evaluation techniques was to be included, for instance, both quantitative and qualitative
data collection and analyses. This process is a variant of Figure 2.

Evaluation Questions. The evaluation questions guiding the study are presented 
below. These questions are a variant of Tables 1, 2, and 3. In each of these, information
related to the adequacy of the AI components (i.e., knowledge representation, instructional
strategy, and student model) is treated as appropriate.

1. What is the underlying theoretical orientation of the system under evaluation?
To what extent does the program serve as a model for ICAI?

2. What instructional strategies and principles are incorporated into the program?
To what extent does the project exhibit instructional content and features
potentially useful to future Army applications?

3. What are the learning outcomes for students? To what extent do learners achieve
project goals? Do students with different background characteristics profit
differentially from exposure to the project? To what extent does the program
create unanticipated outcomes, either positive or negative?

Each of these questions was applied to the PROUST ICAI project. 

PROUST: Program Description. The ICAI system entitled PROUST was 
designed by Johnson and Soloway at Yale University. The system title is a literary
allusion: Remembrances of Bugs Past, with apologies to the original author. 

PROUST is designed to assist novice programmers to use the PASCAL language in 
their own writing of computer programs. The approach taken is to provide intelligent 
feedback to beginning students about the quality of their efforts in an attempt to 
approximate the feedback that a human tutor might provide. In the words of its designers, 
PROUST is: 

"... a tutoring system which helps novice programmers to learn to program." 

"... a system which can be said to truly understand (buggy) novice programs," 
(Johnson & Soloway, 1983). 

Thus, PROUST is not a trivial effort. The designers have had to map the cognitive 
domain of computer programming, with PASCAL as the specific instance. The evaluated 
implementation (circa 1985) of PROUST permitted students to submit their programs in
response to two specific (but intended to be prototypic) programming problems. PROUST
takes as its input programs which have passed through the PASCAL compiler and are
syntactically correct. In analyzing these programs, PROUST attempts to infer students'
intentions and to identify any mistakes (bugs in their software) that occurred in the code 
(Johnson & Soloway, 1983). 

As an example of a functioning ICAI system, PROUST represents only a partial
solution for the need to formatively evaluate a complete ICAI system. It contains the
knowledge representation in software for the problem space of the specific PASCAL
programming problems. It also contains the diagnostic part of a tutoring component,
which analyzes the student program to determine both student intentions and bugs.
PROUST then provides feedback about its inferences about students' intentions and how
well the student program implements the assumed plans. However, it does not have a
robust tutor. Currently (circa 1987) under development is the pedagogical expert, which
knows how to interact with and instruct (tutor) students effectively, and contains a student
model to cumulatively monitor student progress. Although it had been anticipated that these
components would be available for a full test of the ICAI system, schedule constraints 
restricted our activities to the completed components. The Yale project staff attempted to 
include an additional level of feedback in the analyzer as a precursor to the full development 
of the tutor. 

Evaluation Approach. As was discussed previously, for the evaluation of 
PROUST, three sets of questions guided our efforts. The evaluation questions, 
dimensions of inquiry, measurement method, and data sources guiding the study are 
presented in Table 4.

Insert Table 4 about here 

Because the questions clearly call for a variety of data collection and analysis, ranging
from review of documentation and inspection of the program to close observation of outputs
from the programs and student performance and self-report information, the procedures in
the study were complex. Thus, Table 4 summarizes the instrumentation, data collection,
and respondents required for aspects of the program under review.

Formative Evaluation Results. The report by Baker et al. (1985) presents the 
complete description and evaluation of PROUST. There are three major sections in their
document: a theoretical analysis of the program, a formative review, and a report of two
effectiveness studies conducted with PROUST. As was discussed, the purpose of their
evaluation was to provide information relevant to the potential improved effectiveness of
the system. For the purposes of this chapter, we will provide a concise summary of their
findings. We suggest that their methodological approach and measuring procedures are
appropriate for a formative evaluation of ICAI systems in general. 

The theoretical orientation of PROUST is a top-down approach based on intentions 
and plans. Rather than compare the student program to an ideal implementation, PROUST 
compares it to the plan it believes the student was attempting. PROUST inspects a 
student's program and attempts to classify the inferred intentions against a set of 
possibilities based upon prior student approaches. The program's greatest strength is 
perhaps its ability to deal with alternative goal decompositions. Its weakness is that it does 
not explicitly ask the student to confirm the plan that the program "thinks" the student is 
pursuing. 
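
The contrast drawn above, between comparing a program to one ideal implementation and comparing it to the plan the student appears to be following, can be illustrated with a toy sketch. This is not PROUST's actual algorithm. The plan names, the component labels, and the function `diagnose` are hypothetical, and a student program is reduced here to a set of component labels assumed to have been extracted by some earlier step.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    """One known way of decomposing a programming goal (hypothetical)."""
    name: str
    components: frozenset


# Hypothetical catalogue: two alternative plans for "average the inputs".
PLANS = [
    Plan("running-total loop",
         frozenset({"init_sum", "init_count", "read_loop",
                    "add_to_sum", "increment_count", "guard_zero_divide"})),
    Plan("store-then-sum",
         frozenset({"read_into_array", "sum_array",
                    "count_from_length", "guard_zero_divide"})),
]


def diagnose(observed, plans=PLANS):
    """Pick the plan with the largest overlap and report what is missing.

    `observed` is a set of component labels assumed to have been extracted
    from a syntactically correct student program.
    """
    observed = frozenset(observed)

    def overlap(plan):
        # Jaccard similarity between observed and expected components.
        union = observed | plan.components
        return len(observed & plan.components) / len(union) if union else 0.0

    best = max(plans, key=overlap)
    return {
        "inferred_plan": best.name,
        "missing": sorted(best.components - observed),    # candidate bugs
        "unexplained": sorted(observed - best.components),
    }


report = diagnose({"init_sum", "read_loop", "add_to_sum", "increment_count"})
print(report)
```

A confirmation step, in which the student is asked whether the inferred plan is the one intended, would fit naturally after `diagnose` returns; the text identifies the absence of such a step as a weakness.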

Because PROUST was only a partial ICAI system, recommendations for
improvement focused on two instructional features: type of feedback provided to students
and bug analysis. Suggestions for improving feedback were made, especially the content,
tone, and learner-control of feedback. Additional recommendations were made for
increasing the interactive aspects of PROUST's implementation through verification of
student plans, input/output analysis, and student control of timing. In general, Baker et
al.'s (1985) study showed few significant findings relating use of PROUST to learning
outcomes. However, students were generally positive about using the program. The
designers continue their own evaluation efforts, and Soloway has recently presented
workshops (circa 1987) on the topic.

How Can Evaluation Assist ICAI Applications: Some Suggestions

The history of evaluation of ICAI implementations is light reading. For evaluation to
work to the mutual benefit of application designers and their resource providers, we
suggest the following: 

1. The expectation of evaluation should be developed in the minds of the ICAI 
developers. The description of the instructional effectiveness of applications needs to
become part of the socialized ethic, just as in science the expectation of repeatability,
verifiability, and public reporting is commonplace.

2. Rewards for designers' participation in evaluation are necessary. These must be
over and above the intrinsic value of the evaluation information for the designer. Because
evaluation is not a common expectation, special benefits must be developed to create 
cooperation. 

3. The credibility of the evaluation team must be seriously addressed. AI experts 
need to participate in AI and ICAI evaluations. Their participation needs to depend less on 
frantic persuasion and more on a developed sense of professional responsibility (like 
reviewing for a journal). If the approach taken is formative, then the designer can receive
"help" from friendly reviewers. The goal of evaluation of this sort is to aid in revision 
rather than to render a judgment. 

4. Approaches to evaluation must take account of specific features of ICAI 
development. Rather than waiting for the completed development, the evaluation team can
assist in some decision making related to instruction or utilization. While this sounds easy, 
it depends upon the view that "outsiders" know psychology or performance measurement 
in ways that may be useful to ICAI experts. We need to overcome the "not invented here" 
syndrome.

5. Evaluation needs to be componential and focus on the utility of the piece of
software under development. Records of rapid prototyping and redesign need to be
integrated into the formative evaluation. It is as useful to record the blind alleys as the 
successes. 

6. Evaluation needs to be responsible and responsive. Objectivity must be 
preserved, but at the same time, those evaluated must not feel victimized. A reasonably
positive example occurred in the formative evaluation of PROUST (Baker et al., 1985).
Among the most interesting phases of that activity was the dialogue following the
submission of the draft of the report to Dr. Soloway. Through an interactive process, the
evaluation report was strengthened, fuller understanding of the intentions and
accomplishments of the project staff was developed, and points of legitimate disagreement
were identified. In all cases, the AI expert was able to present (directly quoted) his point of
view. The overall outcome was that the fairness of the report was not questioned.

ARTIFICIAL INTELLIGENCE AND TEST DEVELOPMENT 
Although AI has a number of branches that may have educational implications (e.g., 
work in vision to assist the handicapped student), our interest in this section of our chapter 
will focus on the processes related to the design of expert systems and intelligent computer 
assisted instruction (ICAI) as they may help to improve test design. We believe that this 
technology has enormous implications for the creation of rigorous test materials in the 
future. Expert systems provide an opportunity for specific knowledge domains to be 
identified, structured, and incorporated into computer software, while efforts in cognitive 
science have focused on alternative forms of representing such knowledge accurately and 
completely. 

The expertise of "expert" systems sometimes comes from comparing the problem
solving approaches of skilled people and attempting to represent them within the computer,
thus allowing the computer to perform tasks with equivalent expertise (although often with
greater speed and reliability). The techniques to represent knowledge developed for AI
expert systems could potentially be used in the vexing problems of assuring full content
representation on tests. Because content of tests (especially those commercially produced)
varies enormously in depth, comprehensiveness, and accuracy (Herman & Cabello, 1983;
Floden et al., 1980; Baker & Quellmalz, 1980; Burstein et al., 1985), using a knowledge
representation approach may in itself be a contribution for test development, even without
incorporating it as part of a complex, computer-delivered system. Content sampling, and
theory in support of it, is an area of continuing weakness in many test development
activities, particularly those which are locally based.

Knowledge representation is the core of any ICAI system. It focuses on what is the
principal database of interest, which is a knowledge base. Since expert systems combine
the idea of knowledge base and representation with the expert's "wisdom," pertinent issues
in this area for the testing field are: (1) who are the experts (subject matter specialists,
teachers, test developers) and (2) what options are available for eliciting and representing
knowledge in a field. To the first issue, two different approaches have been reported. One
has the expert create a unique knowledge base relevant to a particular subject matter
domain. These domains are usually quite narrow (such as particular microcircuitry) rather
than similar to school subject matter (English literature). Thus, the question of extension of
this approach to real school-based learning is at issue. Another possibility is the use of
so-called expert tools. EMYCIN (Heuristic Programming Project, Stanford), ROSIE (Rand
Corporation), ART (Inference Corporation), and KEE (Intellicorp) are examples of systems
designed to aid the efficient development of the knowledge base without specifying subject
matter (Richer, 1985). More recently, tools have been created for personal computer
environments, e.g., M-1 (Teknowledge) and NEXPERT. These options may permit
development of content for test and item generation. UCLA is currently exploring the
feasibility of using tools of this sort to represent school subject matter (Baker & Freeman,
1987).

A second concern in AI related to assessment is representing the range of errors for
diagnostic and instructional improvement purposes. Here, the work on Intelligent
Computer-Assisted Instruction comes into play. ICAI depends upon the creation of a
student model, a representation of the pattern of responses individual students make and a
comparison of their performance to expert problem solving strategies or a bug
catalogue. The latter is a collection of incorrect procedures or "bugs," particularly as they
apply to identifying micro errors or larger misconceptions (Johnson & Soloway, 1987).
We believe this technology may be useful for the generation of wrong answer alternatives.
Also relevant to this area is how test formats and psychometric quality get into such a
system. Researchers at the Educational Testing Service (Freedle, 1985) have done some
exploratory work on item generation using AI-based environments, presumed to be an
improvement over non-AI assisted computer generation of test item formats.
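
The suggestion that a bug catalogue could supply wrong answer alternatives can be made concrete with a small sketch. The two subtraction bugs below are illustrative stand-ins for catalogue entries and are not taken from the systems cited; the names `BUG_CATALOGUE` and `make_item` are hypothetical.

```python
def correct_subtract(a, b):
    """Correct answer for a - b (assumes a >= b >= 0)."""
    return a - b


def smaller_from_larger(a, b):
    """Bug: in each column, subtract the smaller digit from the larger."""
    width = max(len(str(a)), len(str(b)))
    top, bottom = str(a).zfill(width), str(b).zfill(width)
    digits = [str(abs(int(t) - int(u))) for t, u in zip(top, bottom)]
    return int("".join(digits))


def borrow_without_decrement(a, b):
    """Bug: add ten to the top digit when needed, never decrement the next column."""
    width = max(len(str(a)), len(str(b)))
    top, bottom = str(a).zfill(width), str(b).zfill(width)
    digits = []
    for t, u in zip(top, bottom):
        t, u = int(t), int(u)
        digits.append(str(t - u if t >= u else t + 10 - u))
    return int("".join(digits))


# Hypothetical catalogue: misconception label -> faulty procedure.
BUG_CATALOGUE = {
    "smaller-from-larger": smaller_from_larger,
    "borrow-without-decrement": borrow_without_decrement,
}


def make_item(a, b):
    """Build a multiple-choice item whose distractors each trace to one bug."""
    key = correct_subtract(a, b)
    distractors = {}
    for name, bug in BUG_CATALOGUE.items():
        answer = bug(a, b)
        # Skip bugs that happen to give the right answer or a duplicate option.
        if answer != key and answer not in distractors.values():
            distractors[name] = answer
    return {"stem": f"{a} - {b} = ?", "key": key, "distractors": distractors}


item = make_item(305, 128)
print(item)
```

Because each distractor is labeled with the procedure that produced it, a student's choice of a particular wrong answer carries diagnostic information rather than only counting as an error.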

We believe that the next five years will result in research that addresses, overall,
how developments in ICAI can support the creation of test development systems. Such
research will need to synthesize the science and application base, estimate the feasibility of
building all or pieces of such a system, and create small prototypes.

The AI Test Developer: A Developmental History

At UCLA, work began in 1985 on exploring the feasibility of an AI Test Developer
(Baker, 1985). The original goal for the AI Developer was fairly grandiose. We were
looking for a technology to decentralize testing, to pull some (but not all) of the
responsibility of test design and publishing away from large, commercial entities and place
sufficient testing expertise in the hands of the local educator. The benefits of such a
system would be large. First, at least some fraction of school administered tests would be
consistent with local views of curriculum and responsive to instructional experiences of
students. Second, earlier research at UCLA (see, for example, Herman & Dorr-Bremme,
1983; Baker, 1976) suggests that standardized test information is a relatively unused
commodity in teachers' decision-making practices. However, teachers report that their
own tests provide the basis for data-driven instructional decisions. An AI Test Developer
could provide the needed expertise and efficiency for teachers in the design of their own
measures. Such a system would obviate the high-cost training of teachers in test
development (see Baker, 1978; Baker et al., 1980; Rudman et al., 1980). Such a system
should allow local teachers, district administrators and curriculum personnel, state
managers, and private test developers to create tests that meet local curriculum needs. Such
a global "expert" would fill in deficient competencies of personnel, whether in item
generation, quantitative analyses, or test interpretation. Of most interest are the two ICAI
features mentioned earlier: the content domain issue and the assessment of student errors.

Critical components in the test developer. At the outset, the AI Developer
was conceived as a complex, interacting system. However, a set of practical decisions
modified the view. First, we decided to use commercially available expert system tools for
the implementation of the developer. Second, we decided to constrain development
hardware to likely user hardware in the short term (3 to 5 years) and limit ourselves to
software compatible with personal computers in school districts and schools. Third, with a
relatively scant set of resources, we decided to explore what expertise (other than the main
test design function) was needed. Interviews with school district evaluation managers,
personnel in private test development, and academic experts in achievement measurement
provided an extensive list of discrete topics. Our focus then shifted from developing an
integrated, memory-eating monster to a set of test expert associates, the Test Expert
Associate System (TEAS). During 1987, the first prototype of TEAS was undertaken, with
the expertise represented being that of Ronald Hambleton of the University of Massachusetts.
Using the M-1 expert tool, Hambleton dealt with the problem of the reliability of
criterion-referenced tests. Following the complete encoding of the rules gleaned from Hambleton,
the system will be presented a set of problems to solve and its answers will be validated by
independent trials by Hambleton and two other psychometric experts. Then the system will
be tested by school district personnel in order to document the utility of the format, the
comprehensiveness of the advice, and their reaction to the system itself. At the same time,
we carefully tracked time and cost of the design of the TEAS prototype to determine the
feasibility of subsequent effort.
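
The validation step described above, posing problems to the encoded rules and checking the answers against independent expert judgments, can be sketched as follows. The rules are invented placeholders, not those elicited from Hambleton, and the code neither uses nor imitates the M-1 tool; `RULES`, `advise`, and `agreement` are hypothetical names.

```python
# Each rule pairs conditions that must all hold with a piece of advice.
# The content is placeholder material for illustration only.
RULES = [
    ({"use": "mastery decision", "administrations": 2},
     "estimate decision consistency across the two administrations"),
    ({"use": "mastery decision", "administrations": 1},
     "use a single-administration estimate of decision consistency"),
    ({"use": "domain score"},
     "report the standard error of the domain score estimate"),
]


def advise(case, rules=RULES):
    """Return the advice of the first rule whose conditions all match the case."""
    for conditions, advice in rules:
        if all(case.get(k) == v for k, v in conditions.items()):
            return advice
    return None  # no rule fires: a gap in the knowledge base


def agreement(cases, expert_answers, rules=RULES):
    """Proportion of cases on which the system's advice matches an expert's."""
    if len(cases) != len(expert_answers):
        raise ValueError("one expert answer is required per case")
    hits = sum(advise(c, rules) == e for c, e in zip(cases, expert_answers))
    return hits / len(cases)


# Hypothetical test problems and one hypothetical expert's answers.
cases = [
    {"use": "mastery decision", "administrations": 2},
    {"use": "domain score", "administrations": 1},
    {"use": "norm-referenced ranking", "administrations": 1},
]
expert = [
    "estimate decision consistency across the two administrations",
    "report the standard error of the domain score estimate",
    "compute a classical reliability coefficient",
]
print(agreement(cases, expert))
```

Cases on which no rule fires, or on which the system and an expert disagree, point to the rules that most need revisiting before the system is handed to school district personnel.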

With a short lag, a second TEAS module is under development. Here the intent is
to attempt to represent a portion of school subject matter in order to determine whether it
can be used as a generation context for test items. We have selected speeches from
American History, particularly the Lincoln-Douglas Debates. We are interested in whether
the original idea of the test developer (as an item generator) can be implemented in a
low-cost environment. We are also interested in seeing whether we can find a way to use the
TEAS component to help us generate criteria for adequate student essay responses, another
critical measurement problem. The TEAS work is in process and will undoubtedly be
affected by advances in software, predisposition to technology use, and research in
cognitive science. An area of intense interest for us will be the future developments in
natural language interfaces and understanding. To the extent the natural language field
matures, testing may become less circumscribed, constrained, and formal and its
development more distributed. We still feel we have the right goal (although, like ICAI
designers, we view it as a context rather than a product to be engineered): the development
of a system that uses school subject matter knowledge bases, a system that could be
standardized and shared. Assessment devices would grow from these knowledge bases
and might differ in symbolic representation presented or elicited from the learner and
capitalize on student individual differences.

Conclusion

We have attempted to take a Janus view, of the ICAI field on the one hand and
measurement and evaluation on the other. We have described how evaluation and
measurement might be useful to the improvement of ICAI design and function and have
provided a few examples from our own work. We have also discussed new work in
progress on the application of AI technology (TEAS) for the intermediate good of
educational quality, as a resource to improve the measurement of achievement. Neither of
these areas, ICAI or AI-based measurement, has a secure future. They may merely be
side trips on a longer, more important educational journey. Of importance, however, is to
analyze the processes involved in their development and keep the good ideas. By taking
both critical and empirical perspectives, we may be able to find productive, perhaps
technological ways to reach our diverse educational goals.

Figure 1

DESIRABLE PROPERTIES OF A HUMAN TUTOR

• The tutor causes the problem solving heuristics of the student to converge to 
those of the tutor. 

• The tutor chooses appropriate examples and problems for the student. 

• The tutor can work arbitrary examples chosen by the student. 

• The tutor is able to adjust to different student backgrounds. 

• The tutor is able to measure the student's progress. 

• The tutor can review previously learned material with the student as the need 
arises. 

Adapted from Gamble & Page (1980)

Figure 2

FORMATIVE EVALUATION ACTIVITY

1. Check ICAI design against its specifications.

2. Check validity of instructional strategies in tutor with research literature.

3. Conduct feasibility review with instructor.

4. Conduct feasibility test with student(s).

• one-on-one testing

• small group testing

5. Assess instructional effectiveness.

• cognitive

• affective

6. Assess unanticipated outcomes.

7. Conduct revision.



FIGURE 3
ICAI METAPHORS

SETTING                                          STUDENT ROLE             EVALUATION FOCUS

Laboratory                                       Applied scientist        Problem solving ability increased
Classroom                                        Learner                  Learning increased
Arcade                                           Game player              Enjoyment and learning increased
Workbench                                        Troubleshooter           Ability to fix faults increased
Expert System or Automated Job Performance Aid   Human System Component   System goal achieved

TABLE 1
EVALUATION QUESTIONS

I. Are the measures and procedures planned and used for formative and
summative evaluation providing a fair test of the ICAI system?

II. Does the ICAI system meet its multiple goals?

  a. Generalization

    1. Does the prototype provide the desired level of education/training?

    2. Is this level maintained or improved as the prototype addresses more
       complex education/training missions; greater numbers of students;
       distributed sites?

    3. Will the prototype easily generalize (or adapt) to other content areas
       (e.g., algebra to English)?

  b. Technology Push

    1. Does the development of the existing hardware/software components for
       the system (e.g., knowledge representation, graphics) contribute to the
       capability for future education/training?

    2. Have other technological approaches to education/training (e.g.,
       metacognitive skill training) been considered and integrated into the planned
       future prototype?

  c. Unplanned Outcomes (Side-effects Analysis)

    1. Does the system create requirements to train teachers for a new role (e.g.,
       expert remediator)?

    2. Will intensive data collection systems permit answers to "old"
       questions, e.g., relative value of discovery learning, estimation of
       transfer both near and far?

    3. Is the prototype a good environment to validate analytic techniques to
       predict the education/training effectiveness?

    4. Will intensive data collection permit answers to "new" questions from
       cognitive science (e.g., analysis of misconceptions or bugs; differences
       between experts and novices; role of mental models in proficiency)?

TABLE 2

CAI LESSON EVALUATION TECHNIQUES 



QUALITY REVIEW 

Check the language and grammar [e.g., appropriate reading level]. 

Check the surface features [e.g., uncluttered displays]. 

Check questions and menus [e.g., making a choice is clear]. 

Check all invisible functions [e.g., appropriate student records kept].

Check the subject matter content [e.g., information is accurate]. 

Check the off-line material [e.g., directions in operator manual are clear]. 

Revise the lesson. 

Apply the same quality-review procedure to all revisions.

PILOT TESTING 

Enlist about three helpers [i.e., representative of potential students]. 
Explain pilot-testing procedures [e.g., encourage note-taking].
Find out how much they know about the subject matter. 
Observe them go through the lesson. 
Interview them afterwards. 
Revise the lesson. 
Pilot-test all revised lessons. 

VALIDATION 

Use the lesson in the setting for which it was designed. 

Use the lesson with students for which it was designed. 

Evaluate how the students perform in the setting for which you are preparing them.

Obtain as much performance data as you can from different sources.
Obtain data on student achievement attributable to the lesson. 
Obtain data on student attitudes toward the lesson. 



Adapted from Alessi & Trollip (1985), p. 393

TABLE 3
AI FEATURES IN ICAI SYSTEMS

TOPIC                                  EXAMPLES

Knowledge representation techniques    Production rules, frames, networks

Reasoning mechanisms                   Backward and forward chaining, inheritance

Development environment                User-interface, editors and debuggers,
                                       documentation and on-line help systems

Rapid prototypes                       Rapidly developed simulation, exhibit
                                       functionality, convey requirements; not
                                       meant to be operational systems

Student modeling methods               Overlay, buggy, individual differences

Knowledge acquisition techniques       "Shells," knowledge-base elicitors

Validation tools                       Check integrity of knowledge base to identify
                                       conflicting rules or syntactical errors

Cost factors                           Price of software, support, training,
                                       required hardware, skilled personnel

Expert tutor                           Domain-independent instructional strategies

Cognitive or process model             Model of how system accomplishes its tasks,
                                       may be based on models of human reasoning
                                       (e.g., schema)

Languages                              LISP, PROLOG



